Morphological Predictability of Unseen Words Using Computational Analogy
نویسندگان
چکیده
We address the problem of predicting unseen words by relying on the organization of the vocabulary of a language as exhibited by paradigm tables. We present a pipeline to automatically produce paradigm tables from all the words contained in a text. We measure how many unseen words from an unseen test text can be predicted using the paradigm tables obtained from a training text. Experiments are carried out in several languages to compare the morphological richness of languages, and also the richness of the vocabulary of di↵erent authors.
منابع مشابه
A Joint Model for Word Embedding and Word Morphology
This paper presents a joint model for performing unsupervised morphological analysis on words, and learning a character-level composition function from morphemes to word embeddings. Our model splits individual words into segments, and weights each segment according to its ability to predict context words. Our morphological analysis is comparable to dedicated morphological analyzers at the task ...
متن کاملUnsupervised Learning of Morphology
some morphological pattern that recurs among the groups. Such emergent patterns provide enough clues for segmentation and can sometimes be formulated as rules or morphological paradigms. (c) Features and Classes: In this family of methods, a word is seen as made up of a set of features—n-grams in Mayfield and McNamee (2003) and McNamee and Mayfield (2007), and initial/terminal/mid-substring in ...
متن کاملIdentifying Broken Plurals, Irregular Gender, and Rationality in Arabic Text
Arabic morphology is complex, partly because of its richness, and partly because of common irregular word forms, such as broken plurals (which resemble singular nouns), and nouns with irregular gender (feminine nouns that look masculine and vice versa). In addition, Arabic morphosyntactic agreement interacts with the lexical semantic feature of rationality, which has no morphological realizatio...
متن کاملUnsupervised Learning of Morphology Without Morphemes
The first morphological learner based upon the theory of Whole Word Morphology (Ford et al., 1997) is outlined, and preliminary evaluation results are presented. The program, Whole Word Morphologizer, takes a POS-tagged lexicon as input, induces morphological relationships without attempting to discover or identify morphemes, and is then able to generate new words beyond the learning sample. Th...
متن کاملUnsupervised Morphological Analysis by Formal Analogy
While classical approaches to unsupervised morphology acquisition often rely on metrics based on information theory for identifying morphemes, we describe a novel approach relying on the notion of formal analogy. A formal analogy is a relation between four forms, such as: reader is to doer as reading is to doing. Our assumption is that formal analogies identify pairs of morphologically related ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016